Overview

Dataset statistics

Number of variables: 32
Number of observations: 1219694
Missing cells: 3700183
Missing cells (%): 9.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 GiB
Average record size in memory: 1.0 KiB
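
The derived figures in the dataset statistics can be checked from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (observations times variables) and that 1 GiB means 2**30 bytes; the variable names are illustrative and not part of the report.

```python
# Raw counts copied from the dataset statistics.
n_variables = 32
n_observations = 1_219_694
missing_cells = 3_700_183

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: "1.2 GiB" is 1.2 * 2**30 bytes (already a rounded figure).
total_bytes = 1.2 * 2**30
avg_record_kib = total_bytes / n_observations / 1024

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```

Because the total size is itself rounded, the per-record figure is only approximate.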

Variable types

Numeric: 12
Boolean: 4
Categorical: 14
Unsupported: 2

Alerts

Net_Sales_Proceeds has constant value "False" [Constant]
Non_MI_Recoveries has constant value "FRM" [Constant]
Estimated_Loan_to_Value_(ELTV) has constant value "True" [Constant]
Current_Month_Modification_Cost has constant value "False" [Constant]
Expenses has a high cardinality: 54 distinct values [High cardinality]
Taxes_and_Insurance has a high cardinality: 1219694 distinct values [High cardinality]
Monthly_Reporting_Period is highly overall correlated with Current_Loan_Delinquency_Status [High correlation]
Current_Loan_Delinquency_Status is highly overall correlated with Monthly_Reporting_Period and 1 other fields [High correlation]
Loan_Age is highly overall correlated with Expenses [High correlation]
Remaining_Months_to_Legal_Maturity is highly overall correlated with Zero_Balance_Code and 2 other fields [High correlation]
Zero_Balance_Code is highly overall correlated with Remaining_Months_to_Legal_Maturity and 1 other fields [High correlation]
Current_Deferred_UPB is highly overall correlated with Remaining_Months_to_Legal_Maturity and 3 other fields [High correlation]
Due_Date_of_Last_Paid_Installment_(DDLPI) is highly overall correlated with Actual_Loss_Calculation [High correlation]
Maintenance_and_Preservation_Costs is highly overall correlated with Expenses [High correlation]
Actual_Loss_Calculation is highly overall correlated with Current_Loan_Delinquency_Status and 1 other fields [High correlation]
Current_Actual_UPB is highly overall correlated with Current_Deferred_UPB and 1 other fields [High correlation]
MI_Recoveries is highly overall correlated with Step_Modification_Flag and 1 other fields [High correlation]
Expenses is highly overall correlated with Loan_Age and 1 other fields [High correlation]
Miscellaneous_Expenses is highly overall correlated with Current_Actual_UPB [High correlation]
Step_Modification_Flag is highly overall correlated with MI_Recoveries and 1 other fields [High correlation]
Deferred_Payment_Plan is highly overall correlated with MI_Recoveries and 1 other fields [High correlation]
Interest_Bearing_UPB is highly overall correlated with Remaining_Months_to_Legal_Maturity and 1 other fields [High correlation]
Current_Actual_UPB is highly imbalanced (56.6%) [Imbalance]
Defect_Settlement_Date is highly imbalanced (92.1%) [Imbalance]
Modification_Flag is highly imbalanced (67.2%) [Imbalance]
Modification_Cost is highly imbalanced (54.8%) [Imbalance]
Delinquent_Accrued_Interest is highly imbalanced (85.1%) [Imbalance]
Interest_Bearing_UPB is highly imbalanced (52.0%) [Imbalance]
Loan_Age has 95617 (7.8%) missing values [Missing]
Estimated_Loan_to_Value_(ELTV) has 1165178 (95.5%) missing values [Missing]
Zero_Balance_Removal_UPB has 1219694 (100.0%) missing values [Missing]
Delinquency_Due_to_Disaster has 1219694 (100.0%) missing values [Missing]
Loan_Sequence_Number is highly skewed (γ1 = 76.65065439) [Skewed]
Monthly_Reporting_Period is highly skewed (γ1 = 41.23688602) [Skewed]
Taxes_and_Insurance is uniformly distributed [Uniform]
Taxes_and_Insurance has unique values [Unique]
Zero_Balance_Removal_UPB is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Delinquency_Due_to_Disaster is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Remaining_Months_to_Legal_Maturity has 1006523 (82.5%) zeros [Zeros]
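
The percentages in the Missing and Zeros alerts are plain shares of the 1219694 observations. A short sketch of that arithmetic, with the counts copied from the alerts; the helper name `share_pct` and the dictionary labels are made up for illustration.

```python
n_observations = 1_219_694

# Counts copied from the Missing and Zeros alerts.
alert_counts = {
    "Loan_Age (missing)": 95_617,
    "Estimated_Loan_to_Value_(ELTV) (missing)": 1_165_178,
    "Remaining_Months_to_Legal_Maturity (zeros)": 1_006_523,
}


def share_pct(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {name: share_pct(count, n_observations) for name, count in alert_counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```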

Reproduction

Analysis started: 2023-10-06 14:00:30.565677
Analysis finished: 2023-10-06 14:03:22.336422
Duration: 2 minutes and 51.77 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
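
The reported duration can be recovered from the two timestamps. A minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-10-06 14:00:30.565677")
finished = datetime.fromisoformat("2023-10-06 14:03:22.336422")

elapsed = finished - started
# Split the elapsed seconds into whole minutes and remaining seconds.
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```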

Variables

Loan_Sequence_Number
Real number (ℝ)

Distinct: 240
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 758.45945
Minimum: 300
Maximum: 9999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.3 MiB
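
The per-column memory sizes are consistent with simple bytes-per-value arithmetic. The sketch below assumes numeric columns are stored in a 64-bit dtype (8 bytes per value) and Boolean columns in 1 byte per value; the report does not state the dtypes, so both are assumptions that happen to reproduce the 9.3 MiB shown here and the 1.2 MiB shown for Boolean columns.

```python
n_observations = 1_219_694
MIB = 2**20  # bytes per mebibyte

# Assumption: numeric columns use a 64-bit dtype (8 bytes per value).
numeric_mib = n_observations * 8 / MIB

# Assumption: Boolean columns use 1 byte per value.
boolean_mib = n_observations * 1 / MIB

print(f"Numeric column: {numeric_mib:.1f} MiB")
print(f"Boolean column: {boolean_mib:.1f} MiB")
```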

Quantile statistics

Minimum: 300
5-th percentile: 677
Q1: 730
Median: 767
Q3: 791
95-th percentile: 810
Maximum: 9999
Range: 9699
Interquartile range (IQR): 61
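
Range and IQR follow directly from the quantiles, and the same numbers can be used for a quick outlier check. The sketch below applies the common 1.5 × IQR fence rule, which is not something the report itself computes. Under that rule the maximum of 9999 sits far outside the fences, which may indicate a placeholder code rather than a real value (an assumption, not something the report confirms).

```python
# Quantiles copied from the table above.
minimum, q1, q3, maximum = 300, 730, 791, 9999

value_range = maximum - minimum
iqr = q3 - q1

# Tukey-style fences: an illustrative convention, not part of the report.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"Range: {value_range}, IQR: {iqr}")
print(f"Fences: [{lower_fence}, {upper_fence}]")
print(f"Maximum outside fences: {maximum > upper_fence}")
print(f"Minimum outside fences: {minimum < lower_fence}")
```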

Descriptive statistics

Standard deviation: 98.111925
Coefficient of variation (CV): 0.12935685
Kurtosis: 7222.6701
Mean: 758.45945
Median Absolute Deviation (MAD): 29
Skewness: 76.650654
Sum: 9.2508844 × 10^8
Variance: 9625.9499
Monotonicity: Not monotonic
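
Several of the descriptive statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the number of non-missing observations. A small sketch that checks these against the reported values; the inputs are rounded figures from the report, so agreement is approximate.

```python
# Values copied from the report (already rounded for display).
n_observations = 1_219_694  # this variable has no missing values
mean = 758.45945
std = 98.111925

cv = std / mean                # coefficient of variation
variance = std**2              # variance as the squared standard deviation
total = mean * n_observations  # sum of all values

print(f"CV: {cv:.8f}")
print(f"Variance: {variance:.4f}")
print(f"Sum: {total:.7e}")
```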
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
801                 16676    1.4%
790                 15896    1.3%
787                 15396    1.3%
797                 15323    1.3%
791                 15225    1.2%
809                 15183    1.2%
796                 14733    1.2%
800                 14561    1.2%
798                 14445    1.2%
793                 14124    1.2%
Other values (230)  1068132  87.6%
Value  Count  Frequency (%)
300    1      < 0.1%
571    1      < 0.1%
600    15     < 0.1%
601    10     < 0.1%
602    13     < 0.1%
603    9      < 0.1%
604    14     < 0.1%
605    18     < 0.1%
606    15     < 0.1%
607    15     < 0.1%
Value  Count  Frequency (%)
9999   112    < 0.1%
839    2      < 0.1%
835    1      < 0.1%
834    5      < 0.1%
833    2      < 0.1%
832    58     < 0.1%
831    12     < 0.1%
830    26     < 0.1%
829    195    < 0.1%
828    60     < 0.1%

Monthly_Reporting_Period
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 202104.11
Minimum: 202102
Maximum: 202301
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.3 MiB

Quantile statistics

Minimum: 202102
5-th percentile: 202103
Q1: 202103
Median: 202104
Q3: 202105
95-th percentile: 202105
Maximum: 202301
Range: 199
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.9899804
Coefficient of variation (CV): 9.8463134 × 10^-6
Kurtosis: 2090.9932
Mean: 202104.11
Median Absolute Deviation (MAD): 1
Skewness: 41.236886
Sum: 2.4650517 × 10^11
Variance: 3.9600219
Monotonicity: Not monotonic
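
The values of this variable look like YYYYMM periods that were profiled as plain numbers, so statistics such as the range (199) mix a year digit jump with month steps. A minimal sketch of converting such codes to a month index before taking differences, assuming the YYYYMM reading is correct; the function name `yyyymm_to_month_index` is hypothetical.

```python
def yyyymm_to_month_index(period: int) -> int:
    """Map a YYYYMM integer to a running month count (year * 12 + month - 1)."""
    year, month = divmod(period, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"not a valid YYYYMM period: {period}")
    return year * 12 + (month - 1)


# Minimum and maximum copied from the quantile statistics.
first_period, last_period = 202102, 202301

numeric_range = last_period - first_period
months_apart = yyyymm_to_month_index(last_period) - yyyymm_to_month_index(first_period)

print(f"Numeric range: {numeric_range}")
print(f"Calendar distance: {months_apart} months")
```

Treating the column as a date or period type before profiling would likely make the skewness and kurtosis figures more meaningful as well.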
Histogram with fixed size bins (bins=23)
Value              Count   Frequency (%)
202105             439802  36.1%
202104             395399  32.4%
202103             369832  30.3%
202106             12147   1.0%
202102             1943    0.2%
202203             97      < 0.1%
202202             92      < 0.1%
202201             87      < 0.1%
202111             56      < 0.1%
202204             52      < 0.1%
Other values (13)  187     < 0.1%
Value   Count   Frequency (%)
202102  1943    0.2%
202103  369832  30.3%
202104  395399  32.4%
202105  439802  36.1%
202106  12147   1.0%
202107  5       < 0.1%
202108  12      < 0.1%
202109  20      < 0.1%
202110  30      < 0.1%
202111  56      < 0.1%
Value   Count  Frequency (%)
202301  1      < 0.1%
202211  1      < 0.1%
202210  7      < 0.1%
202209  7      < 0.1%
202208  5      < 0.1%
202207  10     < 0.1%
202206  15     < 0.1%
202205  27     < 0.1%
202204  52     < 0.1%
202203  97     < 0.1%

Current_Actual_UPB
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.2 MiB
False
1110746 
True
 
108948
Value  Count    Frequency (%)
False  1110746  91.1%
True   108948   8.9%
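
The Alerts section reports this variable as 56.6% imbalanced. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy for that number of classes. That definition is an assumption, not taken from the report, but it reproduces the 56.6% figure from the counts above; the function name `imbalance_score` is illustrative.

```python
from math import log2


def imbalance_score(counts: list[int]) -> float:
    """1 - normalized Shannon entropy; 0 means balanced, 1 means one class only."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # entropy normalization is undefined for a single class
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(n_classes)


# False / True counts copied from the frequency table above.
score = imbalance_score([1_110_746, 108_948])
print(f"Imbalance: {100 * score:.1f}%")
```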
Distinct283
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean204743.05
Minimum202803
Maximum205212
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.3 MiB

Quantile statistics

Minimum202803
5-th percentile203602
Q1204402
median205102
Q3205104
95-th percentile205104
Maximum205212
Range2409
Interquartile range (IQR)702

Descriptive statistics

Standard deviation: 616.82575
Coefficient of variation (CV): 0.0030126822
Kurtosis: -0.20495481
Mean: 204743.05
Median Absolute Deviation (MAD): 2
Skewness: -1.2566847
Sum: 2.4972387 × 10^11
Variance: 380474
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
205104 313160
25.7%
205103 286612
23.5%
205102 276290
22.7%
203604 82566
 
6.8%
203603 68469
 
5.6%
203602 57573
 
4.7%
204104 27650
 
2.3%
204103 24270
 
2.0%
204102 21452
 
1.8%
205105 7433
 
0.6%
Other values (273) 54219
 
4.4%
ValueCountFrequency (%)
202803 1
 
< 0.1%
202804 3
 
< 0.1%
202805 1
 
< 0.1%
202806 1
 
< 0.1%
202808 1
 
< 0.1%
202901 1
 
< 0.1%
202902 70
< 0.1%
202903 94
< 0.1%
202904 127
< 0.1%
202905 3
 
< 0.1%
ValueCountFrequency (%)
205212 1
 
< 0.1%
205210 1
 
< 0.1%
205209 7
 
< 0.1%
205208 7
 
< 0.1%
205207 3
 
< 0.1%
205206 7
 
< 0.1%
205205 13
 
< 0.1%
205204 26
 
< 0.1%
205203 45
< 0.1%
205202 93
< 0.1%

Loan_Age
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct408
Distinct (%)< 0.1%
Missing95617
Missing (%)7.8%
Infinite0
Infinite (%)0.0%
Mean30517.519
Minimum10180
Maximum49740
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.3 MiB

Quantile statistics

Minimum10180
5-th percentile12060
Q119430
median33340
Q340140
95-th percentile47664
Maximum49740
Range39560
Interquartile range (IQR)20710

Descriptive statistics

Standard deviation: 11322.608
Coefficient of variation (CV): 0.37101994
Kurtosis: -1.2287425
Mean: 30517.519
Median Absolute Deviation (MAD): 8600
Skewness: -0.1987437
Sum: 3.4304041 × 10^10
Variance: 1.2820146 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
31084 36818
 
3.0%
16984 34887
 
2.9%
38060 34623
 
2.8%
47894 31034
 
2.5%
12060 25815
 
2.1%
19740 22853
 
1.9%
40140 22464
 
1.8%
33460 20612
 
1.7%
42644 20347
 
1.7%
26420 19886
 
1.6%
Other values (398) 854738
70.1%
(Missing) 95617
 
7.8%
ValueCountFrequency (%)
10180 310
 
< 0.1%
10380 1
 
< 0.1%
10420 2108
0.2%
10500 130
 
< 0.1%
10540 569
 
< 0.1%
10580 1877
0.2%
10740 2832
0.2%
10780 205
 
< 0.1%
10900 2689
0.2%
11020 235
 
< 0.1%
ValueCountFrequency (%)
49740 660
 
0.1%
49700 786
 
0.1%
49660 985
 
0.1%
49620 1687
 
0.1%
49500 1
 
< 0.1%
49420 556
 
< 0.1%
49340 4399
0.4%
49180 1715
 
0.1%
49020 633
 
0.1%
48900 1417
 
0.1%

Remaining_Months_to_Legal_Maturity
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct31
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.0664224
Minimum0
Maximum55
Zeros1006523
Zeros (%)82.5%
Negative0
Negative (%)0.0%
Memory size9.3 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile30
Maximum55
Range55
Interquartile range (IQR)0

Descriptive statistics

Standard deviation9.3919546
Coefficient of variation (CV)2.3096358
Kurtosis2.5661946
Mean4.0664224
Median Absolute Deviation (MAD)0
Skewness2.0621656
Sum4959791
Variance88.208811
MonotonicityNot monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
0 1006523
82.5%
25 79157
 
6.5%
30 69593
 
5.7%
12 50326
 
4.1%
35 6065
 
0.5%
6 5495
 
0.5%
18 1396
 
0.1%
16 1090
 
0.1%
15 12
 
< 0.1%
20 6
 
< 0.1%
Other values (21) 31
 
< 0.1%
ValueCountFrequency (%)
0 1006523
82.5%
6 5495
 
0.5%
12 50326
 
4.1%
14 2
 
< 0.1%
15 12
 
< 0.1%
16 1090
 
0.1%
18 1396
 
0.1%
20 6
 
< 0.1%
22 1
 
< 0.1%
23 1
 
< 0.1%
ValueCountFrequency (%)
55 1
 
< 0.1%
50 1
 
< 0.1%
49 1
 
< 0.1%
47 1
 
< 0.1%
45 2
< 0.1%
43 1
 
< 0.1%
42 1
 
< 0.1%
38 1
 
< 0.1%
37 1
 
< 0.1%
36 4
< 0.1%
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size67.5 MiB
1
1196476 
2
 
16687
3
 
3616
4
 
2915

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1219694
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 1196476
98.1%
2 16687
 
1.4%
3 3616
 
0.3%
4 2915
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 1196476
98.1%
2 16687
 
1.4%
3 3616
 
0.3%
4 2915
 
0.2%

Most occurring characters

ValueCountFrequency (%)
1 1196476
98.1%
2 16687
 
1.4%
3 3616
 
0.3%
4 2915
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1219694
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1196476
98.1%
2 16687
 
1.4%
3 3616
 
0.3%
4 2915
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 1219694
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1196476
98.1%
2 16687
 
1.4%
3 3616
 
0.3%
4 2915
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1219694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1196476
98.1%
2 16687
 
1.4%
3 3616
 
0.3%
4 2915
 
0.2%
Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size67.5 MiB
P
1111055 
I
 
65538
S
 
43101

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1219694
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowP
2nd rowP
3rd rowP
4th rowP
5th rowS

Common Values

ValueCountFrequency (%)
P 1111055
91.1%
I 65538
 
5.4%
S 43101
 
3.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
p 1111055
91.1%
i 65538
 
5.4%
s 43101
 
3.5%

Most occurring characters

ValueCountFrequency (%)
P 1111055
91.1%
I 65538
 
5.4%
S 43101
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1219694
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 1111055
91.1%
I 65538
 
5.4%
S 43101
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 1219694
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 1111055
91.1%
I 65538
 
5.4%
S 43101
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1219694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 1111055
91.1%
I 65538
 
5.4%
S 43101
 
3.5%

Zero_Balance_Code
Real number (ℝ)

Distinct104
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean68.428521
Minimum3
Maximum999
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.3 MiB

Quantile statistics

Minimum3
5-th percentile36
Q158
median71
Q380
95-th percentile95
Maximum999
Range996
Interquartile range (IQR)22

Descriptive statistics

Standard deviation17.032429
Coefficient of variation (CV)0.24890833
Kurtosis7.3123924
Mean68.428521
Median Absolute Deviation (MAD)10
Skewness-0.46031166
Sum83461856
Variance290.10363
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
80 159436
 
13.1%
75 75649
 
6.2%
95 65453
 
5.4%
70 55248
 
4.5%
60 46423
 
3.8%
90 40932
 
3.4%
69 29352
 
2.4%
74 27123
 
2.2%
79 25466
 
2.1%
85 24358
 
2.0%
Other values (94) 670254
55.0%
ValueCountFrequency (%)
3 1
 
< 0.1%
4 3
 
< 0.1%
5 15
 
< 0.1%
6 29
 
< 0.1%
7 62
 
< 0.1%
8 76
 
< 0.1%
9 120
 
< 0.1%
10 163
< 0.1%
11 224
< 0.1%
12 318
< 0.1%
ValueCountFrequency (%)
999 1
 
< 0.1%
105 18
 
< 0.1%
104 29
 
< 0.1%
103 53
 
< 0.1%
102 235
 
< 0.1%
101 308
 
< 0.1%
100 450
 
< 0.1%
99 89
 
< 0.1%
98 49
 
< 0.1%
97 16119
1.3%
Distinct52
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean33.008357
Minimum1
Maximum999
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.3 MiB

Quantile statistics

Minimum1
5-th percentile16
Q126
median34
Q341
95-th percentile48
Maximum999
Range998
Interquartile range (IQR)15

Descriptive statistics

Standard deviation10.993696
Coefficient of variation (CV)0.33305796
Kurtosis1562.3705
Mean33.008357
Median Absolute Deviation (MAD)8
Skewness17.60132
Sum40260095
Variance120.86135
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
45 43273
 
3.5%
44 42989
 
3.5%
43 42843
 
3.5%
42 42243
 
3.5%
41 41411
 
3.4%
39 41142
 
3.4%
38 41110
 
3.4%
40 41045
 
3.4%
37 40670
 
3.3%
36 40612
 
3.3%
Other values (42) 802356
65.8%
ValueCountFrequency (%)
1 64
 
< 0.1%
2 113
 
< 0.1%
3 171
 
< 0.1%
4 241
 
< 0.1%
5 382
 
< 0.1%
6 645
 
0.1%
7 968
 
0.1%
8 1511
0.1%
9 2287
0.2%
10 3411
0.3%
ValueCountFrequency (%)
999 32
 
< 0.1%
51 6
 
< 0.1%
50 20718
1.7%
49 23955
2.0%
48 23305
1.9%
47 23574
1.9%
46 25059
2.1%
45 43273
3.5%
44 42989
3.5%
43 42843
3.5%

Current_Interest_Rate
Real number (ℝ)

Distinct1071
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean291095.95
Minimum12000
Maximum1582000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.3 MiB

Quantile statistics

Minimum12000
5-th percentile100000
Q1179000
median263000
Q3375000
95-th percentile548000
Maximum1582000
Range1570000
Interquartile range (IQR)196000

Descriptive statistics

Standard deviation: 148360.86
Coefficient of variation (CV): 0.50966307
Kurtosis: 0.96798377
Mean: 291095.95
Median Absolute Deviation (MAD): 94000
Skewness: 0.95308568
Sum: 3.5504799 × 10^11
Variance: 2.2010944 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
548000 19487
 
1.6%
200000 10222
 
0.8%
300000 9045
 
0.7%
150000 7903
 
0.6%
250000 7461
 
0.6%
240000 6464
 
0.5%
510000 6272
 
0.5%
280000 6094
 
0.5%
100000 6038
 
0.5%
180000 5961
 
0.5%
Other values (1061) 1134747
93.0%
ValueCountFrequency (%)
12000 1
 
< 0.1%
13000 2
 
< 0.1%
14000 1
 
< 0.1%
15000 2
 
< 0.1%
17000 2
 
< 0.1%
20000 8
 
< 0.1%
21000 4
 
< 0.1%
22000 10
< 0.1%
23000 13
< 0.1%
24000 23
< 0.1%
ValueCountFrequency (%)
1582000 4
< 0.1%
1581000 1
 
< 0.1%
1551000 1
 
< 0.1%
1538000 2
< 0.1%
1520000 1
 
< 0.1%
1500000 2
< 0.1%
1495000 1
 
< 0.1%
1493000 1
 
< 0.1%
1478000 1
 
< 0.1%
1473000 1
 
< 0.1%

Current_Deferred_UPB
Real number (ℝ)

Distinct96
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean68.199353
Minimum3
Maximum101
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.3 MiB

Quantile statistics

Minimum3
5-th percentile36
Q158
median70
Q380
95-th percentile95
Maximum101
Range98
Interquartile range (IQR)22

Descriptive statistics

Standard deviation17.049955
Coefficient of variation (CV)0.25000171
Kurtosis0.001959364
Mean68.199353
Median Absolute Deviation (MAD)10
Skewness-0.58672812
Sum83182342
Variance290.70096
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
80 158893
 
13.0%
75 76040
 
6.2%
95 65353
 
5.4%
70 55505
 
4.6%
60 47038
 
3.9%
90 39526
 
3.2%
69 29492
 
2.4%
74 27128
 
2.2%
79 25161
 
2.1%
59 23800
 
2.0%
Other values (86) 671758
55.1%
ValueCountFrequency (%)
3 1
 
< 0.1%
4 3
 
< 0.1%
5 15
 
< 0.1%
6 30
 
< 0.1%
7 63
 
< 0.1%
8 77
 
< 0.1%
9 124
 
< 0.1%
10 169
< 0.1%
11 230
< 0.1%
12 329
< 0.1%
ValueCountFrequency (%)
101 1
 
< 0.1%
97 16871
 
1.4%
96 327
 
< 0.1%
95 65353
5.4%
94 3815
 
0.3%
93 4065
 
0.3%
92 4047
 
0.3%
91 2920
 
0.2%
90 39526
3.2%
89 7695
 
0.6%
Distinct1637
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.7973943
Minimum1.5
Maximum6.5
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.3 MiB

Quantile statistics

Minimum1.5
5-th percentile2.25
Q12.625
median2.875
Q32.99
95-th percentile3.375
Maximum6.5
Range5
Interquartile range (IQR)0.365

Descriptive statistics

Standard deviation0.33631568
Coefficient of variation (CV)0.12022462
Kurtosis2.7277912
Mean2.7973943
Median Absolute Deviation (MAD)0.125
Skewness0.49359934
Sum3411965.1
Variance0.11310823
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2.875 251124
20.6%
2.75 206922
17.0%
2.625 109609
9.0%
2.5 104228
8.5%
2.99 92882
 
7.6%
3 70753
 
5.8%
2.375 62782
 
5.1%
2.25 58164
 
4.8%
3.125 53926
 
4.4%
3.25 35000
 
2.9%
Other values (1627) 174304
14.3%
ValueCountFrequency (%)
1.5 2
 
< 0.1%
1.625 3
 
< 0.1%
1.75 1950
0.2%
1.751 1
 
< 0.1%
1.756 1
 
< 0.1%
1.766 1
 
< 0.1%
1.775 2
 
< 0.1%
1.799 2
 
< 0.1%
1.8 2
 
< 0.1%
1.802 1
 
< 0.1%
ValueCountFrequency (%)
6.5 1
 
< 0.1%
6.125 1
 
< 0.1%
6 1
 
< 0.1%
5.875 3
 
< 0.1%
5.75 1
 
< 0.1%
5.625 2
 
< 0.1%
5.5 31
< 0.1%
5.49 2
 
< 0.1%
5.375 12
 
< 0.1%
5.25 68
< 0.1%

MI_Recoveries
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size67.5 MiB
R
713865 
C
342082 
B
163747 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1219694
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowC
2nd rowR
3rd rowR
4th rowR
5th rowR

Common Values

ValueCountFrequency (%)
R 713865
58.5%
C 342082
28.0%
B 163747
 
13.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
r 713865
58.5%
c 342082
28.0%
b 163747
 
13.4%

Most occurring characters

ValueCountFrequency (%)
R 713865
58.5%
C 342082
28.0%
B 163747
 
13.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1219694
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 713865
58.5%
C 342082
28.0%
B 163747
 
13.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1219694
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 713865
58.5%
C 342082
28.0%
B 163747
 
13.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1219694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 713865
58.5%
C 342082
28.0%
B 163747
 
13.4%
Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.2 MiB
False
1219694 
ValueCountFrequency (%)
False 1219694
100.0%
Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size69.8 MiB
FRM
1219694 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters3659082
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFRM
2nd rowFRM
3rd rowFRM
4th rowFRM
5th rowFRM

Common Values

ValueCountFrequency (%)
FRM 1219694
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
frm 1219694
100.0%

Most occurring characters

ValueCountFrequency (%)
F 1219694
33.3%
R 1219694
33.3%
M 1219694
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3659082
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 1219694
33.3%
R 1219694
33.3%
M 1219694
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 3659082
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 1219694
33.3%
R 1219694
33.3%
M 1219694
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3659082
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 1219694
33.3%
R 1219694
33.3%
M 1219694
33.3%

Expenses
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct54
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size68.6 MiB
CA
183913 
TX
 
74595
FL
 
72294
IL
 
51483
AZ
 
45836
Other values (49)
791573 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters2439388
Distinct characters24
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNJ
2nd rowCO
3rd rowDE
4th rowKY
5th rowMA

Common Values

ValueCountFrequency (%)
CA 183913
 
15.1%
TX 74595
 
6.1%
FL 72294
 
5.9%
IL 51483
 
4.2%
AZ 45836
 
3.8%
WA 42037
 
3.4%
OH 40598
 
3.3%
VA 40530
 
3.3%
CO 39707
 
3.3%
MI 39033
 
3.2%
Other values (44) 589668
48.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ca 183913
 
15.1%
tx 74595
 
6.1%
fl 72294
 
5.9%
il 51483
 
4.2%
az 45836
 
3.8%
wa 42037
 
3.4%
oh 40598
 
3.3%
va 40530
 
3.3%
co 39707
 
3.3%
mi 39033
 
3.2%
Other values (44) 589668
48.3%

Most occurring characters

ValueCountFrequency (%)
A 460542
18.9%
C 291306
11.9%
N 212183
 
8.7%
M 172417
 
7.1%
I 161501
 
6.6%
L 147230
 
6.0%
T 138181
 
5.7%
O 136290
 
5.6%
X 74595
 
3.1%
F 72294
 
3.0%
Other values (14) 572849
23.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2439388
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 460542
18.9%
C 291306
11.9%
N 212183
 
8.7%
M 172417
 
7.1%
I 161501
 
6.6%
L 147230
 
6.0%
T 138181
 
5.7%
O 136290
 
5.6%
X 74595
 
3.1%
F 72294
 
3.0%
Other values (14) 572849
23.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 2439388
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 460542
18.9%
C 291306
11.9%
N 212183
 
8.7%
M 172417
 
7.1%
I 161501
 
6.6%
L 147230
 
6.0%
T 138181
 
5.7%
O 136290
 
5.6%
X 74595
 
3.1%
F 72294
 
3.0%
Other values (14) 572849
23.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2439388
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 460542
18.9%
C 291306
11.9%
N 212183
 
8.7%
M 172417
 
7.1%
I 161501
 
6.6%
L 147230
 
6.0%
T 138181
 
5.7%
O 136290
 
5.6%
X 74595
 
3.1%
F 72294
 
3.0%
Other values (14) 572849
23.5%

Legal_Costs
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size68.6 MiB
SF
784789 
PU
338101 
CO
91157 
MH
 
4190
CP
 
1457

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters2439388
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPU
2nd rowSF
3rd rowSF
4th rowSF
5th rowSF

Common Values

ValueCountFrequency (%)
SF 784789
64.3%
PU 338101
27.7%
CO 91157
 
7.5%
MH 4190
 
0.3%
CP 1457
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
sf 784789
64.3%
pu 338101
27.7%
co 91157
 
7.5%
mh 4190
 
0.3%
cp 1457
 
0.1%

Most occurring characters

ValueCountFrequency (%)
S 784789
32.2%
F 784789
32.2%
P 339558
13.9%
U 338101
13.9%
C 92614
 
3.8%
O 91157
 
3.7%
M 4190
 
0.2%
H 4190
 
0.2%
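
The per-character counts above can be derived from the category counts alone, since each two-letter code contributes each of its letters once per row. A short sketch with the counts copied from the Common Values table; the variable names are illustrative.

```python
from collections import Counter

# Category counts copied from the Common Values table.
category_counts = {"SF": 784_789, "PU": 338_101, "CO": 91_157, "MH": 4_190, "CP": 1_457}

# Each occurrence of a code contributes one occurrence of each of its letters.
char_counts = Counter()
for code, count in category_counts.items():
    for ch in code:
        char_counts[ch] += count

total_chars = sum(char_counts.values())

for ch, count in char_counts.most_common():
    print(f"{ch} {count} {100 * count / total_chars:.1f}%")
```

This explains why P (339558) and C (92614) exceed the counts of any single category: P appears in both PU and CP, and C in both CO and CP.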

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2439388
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 784789
32.2%
F 784789
32.2%
P 339558
13.9%
U 338101
13.9%
C 92614
 
3.8%
O 91157
 
3.7%
M 4190
 
0.2%
H 4190
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 2439388
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 784789
32.2%
F 784789
32.2%
P 339558
13.9%
U 338101
13.9%
C 92614
 
3.8%
O 91157
 
3.7%
M 4190
 
0.2%
H 4190
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2439388
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 784789
32.2%
F 784789
32.2%
P 339558
13.9%
U 338101
13.9%
C 92614
 
3.8%
O 91157
 
3.7%
M 4190
 
0.2%
H 4190
 
0.2%
Distinct888
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean55443.636
Minimum600
Maximum99900
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.3 MiB

Quantile statistics

Minimum600
5-th percentile6600
Q129900
median55100
Q385200
95-th percentile97100
Maximum99900
Range99300
Interquartile range (IQR)55300

Descriptive statistics

Standard deviation: 30630.112
Coefficient of variation (CV): 0.55245497
Kurtosis: -1.3493143
Mean: 55443.636
Median Absolute Deviation (MAD): 27800
Skewness: -0.12120529
Sum: 6.7624271 × 10^10
Variance: 9.3820378 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
94500 17057
 
1.4%
85200 13359
 
1.1%
84000 12312
 
1.0%
75000 12126
 
1.0%
30000 12007
 
1.0%
98000 10968
 
0.9%
60000 10567
 
0.9%
85300 10455
 
0.9%
20100 9860
 
0.8%
60600 9438
 
0.8%
Other values (878) 1101545
90.3%
ValueCountFrequency (%)
600 39
 
< 0.1%
700 55
 
< 0.1%
800 87
 
< 0.1%
900 112
 
< 0.1%
1000 1221
0.1%
1100 311
 
< 0.1%
1200 225
 
< 0.1%
1300 146
 
< 0.1%
1400 915
 
0.1%
1500 2368
0.2%
ValueCountFrequency (%)
99900 53
 
< 0.1%
99800 217
 
< 0.1%
99700 179
 
< 0.1%
99600 634
 
0.1%
99500 1117
0.1%
99400 72
 
< 0.1%
99300 1481
0.1%
99200 1714
0.1%
99100 315
 
< 0.1%
99000 819
0.1%

Taxes_and_Insurance
Categorical

HIGH CARDINALITY  UNIFORM  UNIQUE 

Distinct1219694
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size80.3 MiB
F20Q41275812
 
1
F21Q12090498
 
1
F21Q12090496
 
1
F21Q12090495
 
1
F21Q12090494
 
1
Other values (1219689)
1219689 

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters14636328
Distinct characters12
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1219694 ?
Unique (%)100.0%

Sample

1st rowF20Q41275812
2nd rowF21Q11275813
3rd rowF21Q11275814
4th rowF21Q11275815
5th rowF21Q11275816
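
The sample values share a fixed 12-character layout: the letter F, two digits, the letter Q, one digit, and seven more digits. The sketch below splits such a value into parts. Reading the parts as a two-digit year, a quarter, and a serial number is inferred from the five samples and is an assumption, as are the names `ID_PATTERN` and `parse_identifier` and the choice to add 2000 to the year.

```python
import re

# Layout inferred from the sample rows: F + YY + Q + quarter + 7-digit serial.
ID_PATTERN = re.compile(r"^F(?P<year>\d{2})Q(?P<quarter>[1-4])(?P<serial>\d{7})$")


def parse_identifier(value: str):
    """Split an identifier into year, quarter, and serial; None if it does not match."""
    match = ID_PATTERN.match(value)
    if match is None:
        return None
    return {
        "year": 2000 + int(match["year"]),  # assumption: years are in the 2000s
        "quarter": int(match["quarter"]),
        "serial": int(match["serial"]),
    }


parsed = parse_identifier("F20Q41275812")
print(parsed)
```

Since every value in this column is unique, it behaves like a row identifier rather than a categorical feature, which is what the High cardinality, Uniform, and Unique alerts point to.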

Common Values

ValueCountFrequency (%)
F20Q41275812 1
 
< 0.1%
F21Q12090498 1
 
< 0.1%
F21Q12090496 1
 
< 0.1%
F21Q12090495 1
 
< 0.1%
F21Q12090494 1
 
< 0.1%
F21Q12090493 1
 
< 0.1%
F21Q12090492 1
 
< 0.1%
F21Q12090491 1
 
< 0.1%
F21Q12090490 1
 
< 0.1%
F21Q12090489 1
 
< 0.1%
Other values (1219684) 1219684
> 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
f20q41275812 1
 
< 0.1%
f21q11275816 1
 
< 0.1%
f21q11275818 1
 
< 0.1%
f21q11275819 1
 
< 0.1%
f21q11275820 1
 
< 0.1%
f21q11275821 1
 
< 0.1%
f21q11275822 1
 
< 0.1%
f21q11275823 1
 
< 0.1%
f21q11275824 1
 
< 0.1%
f21q11275825 1
 
< 0.1%
Other values (1219684) 1219684
> 99.9%

Most occurring characters

ValueCountFrequency (%)
1 3869502
26.4%
2 2447989
16.7%
F 1219694
 
8.3%
Q 1219694
 
8.3%
3 807148
 
5.5%
4 804763
 
5.5%
8 717300
 
4.9%
9 715024
 
4.9%
7 712029
 
4.9%
6 708246
 
4.8%
Other values (2) 1414939
 
9.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12196940
83.3%
Uppercase Letter 2439388
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3869502
31.7%
2 2447989
20.1%
3 807148
 
6.6%
4 804763
 
6.6%
8 717300
 
5.9%
9 715024
 
5.9%
7 712029
 
5.8%
6 708246
 
5.8%
5 707547
 
5.8%
0 707392
 
5.8%
Uppercase Letter
ValueCountFrequency (%)
F 1219694
50.0%
Q 1219694
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12196940
83.3%
Latin 2439388
 
16.7%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3869502
31.7%
2 2447989
20.1%
3 807148
 
6.6%
4 804763
 
6.6%
8 717300
 
5.9%
9 715024
 
5.9%
7 712029
 
5.8%
6 708246
 
5.8%
5 707547
 
5.8%
0 707392
 
5.8%
Latin
ValueCountFrequency (%)
F 1219694
50.0%
Q 1219694
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14636328
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3869502
26.4%
2 2447989
16.7%
F 1219694
 
8.3%
Q 1219694
 
8.3%
3 807148
 
5.5%
4 804763
 
5.5%
8 717300
 
4.9%
9 715024
 
4.9%
7 712029
 
4.9%
6 708246
 
4.8%
Other values (2) 1414939
 
9.7%
Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size67.5 MiB
N
665575 
P
277384 
C
276735 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1219694
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowP
2nd rowC
3rd rowN
4th rowN
5th rowN

Common Values

ValueCountFrequency (%)
N 665575
54.6%
P 277384
22.7%
C 276735
22.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
n 665575
54.6%
p 277384
22.7%
c 276735
22.7%

Most occurring characters

ValueCountFrequency (%)
N 665575
54.6%
P 277384
22.7%
C 276735
22.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1219694
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 665575
54.6%
P 277384
22.7%
C 276735
22.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 1219694
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 665575
54.6%
P 277384
22.7%
C 276735
22.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1219694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 665575
54.6%
P 277384
22.7%
C 276735
22.7%

Actual_Loss_Calculation
Real number (ℝ)

Distinct: 247
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 316.80677
Minimum: 85
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.3 MiB

Quantile statistics

Minimum: 85
5-th percentile: 180
Q1: 276
Median: 360
Q3: 360
95-th percentile: 360
Maximum: 366
Range: 281
Interquartile range (IQR): 84

Descriptive statistics

Standard deviation: 74.017242
Coefficient of variation (CV): 0.23363529
Kurtosis: -0.20440694
Mean: 316.80677
Median Absolute Deviation (MAD): 0
Skewness: -1.2570806
Sum: 3.8640732 × 10^8
Variance: 5478.5521
Monotonicity: Not monotonic
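Several of the figures above are tied together by simple identities, and checking them is a quick way to sanity-test a transcription of the report. The sketch below recomputes the derived statistics from the reported mean, standard deviation, quartiles, and extremes. It assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1) and uses the observation count from the dataset overview.

```python
import math

# Values copied from the report for this variable.
n = 1219694              # observations (none missing for this variable)
mean = 316.80677
std = 74.017242
q1, q3 = 276, 360
minimum, maximum = 85, 366

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum of all values

print(f"CV={cv:.6f} variance={variance:.2f} IQR={iqr} "
      f"range={value_range} sum={total:.4e}")

# Compare against the reported figures, allowing for rounding in the report.
assert math.isclose(cv, 0.23363529, rel_tol=1e-5)
assert math.isclose(variance, 5478.5521, rel_tol=1e-5)
assert math.isclose(total, 3.8640732e8, rel_tol=1e-6)
```

The tolerances are loose on purpose, because the report rounds each statistic independently.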
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
360  885358  72.6%
180  212052  17.4%
240  74428  6.1%
300  18499  1.5%
120  12464  1.0%
324  2272  0.2%
348  1824  0.1%
144  1135  0.1%
336  1124  0.1%
312  813  0.1%
Other values (237)  9725  0.8%
Value  Count  Frequency (%)
85  6  < 0.1%
90  1  < 0.1%
96  293  < 0.1%
97  1  < 0.1%
98  1  < 0.1%
99  1  < 0.1%
100  1  < 0.1%
101  2  < 0.1%
102  1  < 0.1%
106  2  < 0.1%
Value  Count  Frequency (%)
366  8  < 0.1%
365  2  < 0.1%
364  1  < 0.1%
360  885358  72.6%
359  23  < 0.1%
358  3  < 0.1%
357  8  < 0.1%
356  16  < 0.1%
355  19  < 0.1%
354  64  < 0.1%
Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size67.5 MiB
1
631435 
2
580935 
3
 
6252
4
 
1054
5
 
18

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1219694
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row2
3rd row1
4th row2
5th row2

Common Values

ValueCountFrequency (%)
1 631435
51.8%
2 580935
47.6%
3 6252
 
0.5%
4 1054
 
0.1%
5 18
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 631435
51.8%
2 580935
47.6%
3 6252
 
0.5%
4 1054
 
0.1%
5 18
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1 631435
51.8%
2 580935
47.6%
3 6252
 
0.5%
4 1054
 
0.1%
5 18
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1219694
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 631435
51.8%
2 580935
47.6%
3 6252
 
0.5%
4 1054
 
0.1%
5 18
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 1219694
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 631435
51.8%
2 580935
47.6%
3 6252
 
0.5%
4 1054
 
0.1%
5 18
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1219694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 631435
51.8%
2 580935
47.6%
3 6252
 
0.5%
4 1054
 
0.1%
5 18
 
< 0.1%
Distinct21
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size89.6 MiB
Other sellers
438820 
QUICKEN LOANS, LLC
109365 
WELLS FARGO BANK, N.A.
61002 
PENNYMAC CORP.
60711 
UNITED WHOLESALE MORTGAGE, LLC
58230 
Other values (16)
491566 

Length

Max length41
Median length40
Mean length20.06181
Min length10

Characters and Unicode

Total characters24469269
Distinct characters34
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowWELLS FARGO BANK, N.A.
2nd rowOther sellers
3rd rowOther sellers
4th rowOther sellers
5th rowOther sellers

Common Values

ValueCountFrequency (%)
Other sellers 438820
36.0%
QUICKEN LOANS, LLC 109365
 
9.0%
WELLS FARGO BANK, N.A. 61002
 
5.0%
PENNYMAC CORP. 60711
 
5.0%
UNITED WHOLESALE MORTGAGE, LLC 58230
 
4.8%
JPMORGAN CHASE BANK, NATIONAL ASSOCIATION 54860
 
4.5%
LOANDEPOT.COM, LLC 49895
 
4.1%
NATIONSTAR MORTGAGE LLC DBA MR. COOPER 48642
 
4.0%
NEWREZ LLC 44794
 
3.7%
AMERIHOME MORTGAGE COMPANY, LLC 41636
 
3.4%
Other values (11) 251739
20.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
other 438820
 
12.3%
sellers 438820
 
12.3%
llc 381577
 
10.7%
mortgage 213662
 
6.0%
bank 204126
 
5.7%
loans 138778
 
3.9%
quicken 109365
 
3.1%
n.a 89471
 
2.5%
corporation 83511
 
2.3%
pennymac 76868
 
2.2%
Other values (36) 1393849
39.1%

Most occurring characters

ValueCountFrequency (%)
2349153
 
9.6%
O 1953775
 
8.0%
A 1868194
 
7.6%
N 1621062
 
6.6%
L 1340635
 
5.5%
e 1316460
 
5.4%
E 1305181
 
5.3%
C 1135070
 
4.6%
R 996367
 
4.1%
r 877640
 
3.6%
Other values (24) 9705732
39.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 16353673
66.8%
Lowercase Letter 4827020
 
19.7%
Space Separator 2349153
 
9.6%
Other Punctuation 939423
 
3.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 1953775
11.9%
A 1868194
11.4%
N 1621062
9.9%
L 1340635
 
8.2%
E 1305181
 
8.0%
C 1135070
 
6.9%
R 996367
 
6.1%
I 841302
 
5.1%
T 812730
 
5.0%
S 656434
 
4.0%
Other values (15) 3822923
23.4%
Lowercase Letter
ValueCountFrequency (%)
e 1316460
27.3%
r 877640
18.2%
s 877640
18.2%
l 877640
18.2%
t 438820
 
9.1%
h 438820
 
9.1%
Other Punctuation
ValueCountFrequency (%)
, 493346
52.5%
. 446077
47.5%
Space Separator
ValueCountFrequency (%)
2349153
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21180693
86.6%
Common 3288576
 
13.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 1953775
 
9.2%
A 1868194
 
8.8%
N 1621062
 
7.7%
L 1340635
 
6.3%
e 1316460
 
6.2%
E 1305181
 
6.2%
C 1135070
 
5.4%
R 996367
 
4.7%
r 877640
 
4.1%
s 877640
 
4.1%
Other values (21) 7888669
37.2%
Common
ValueCountFrequency (%)
2349153
71.4%
, 493346
 
15.0%
. 446077
 
13.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24469269
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2349153
 
9.6%
O 1953775
 
8.0%
A 1868194
 
7.6%
N 1621062
 
6.6%
L 1340635
 
5.5%
e 1316460
 
5.4%
E 1305181
 
5.3%
C 1135070
 
4.6%
R 996367
 
4.1%
r 877640
 
3.6%
Other values (24) 9705732
39.7%
Distinct25
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size92.5 MiB
Other servicers
358972 
JPMORGAN CHASE BANK, NATIONAL ASSOCIATION
111702 
ROCKET MORTGAGE, LLC
82436 
WELLS FARGO BANK, N.A.
60938 
PENNYMAC CORP.
60710 
Other values (20)
544936 

Length

Max length41
Median length32
Mean length22.522049
Min length10

Characters and Unicode

Total characters27470008
Distinct characters36
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowWELLS FARGO BANK, N.A.
2nd rowU.S. BANK N.A.
3rd rowU.S. BANK N.A.
4th rowU.S. BANK N.A.
5th rowOther servicers

Common Values

ValueCountFrequency (%)
Other servicers 358972
29.4%
JPMORGAN CHASE BANK, NATIONAL ASSOCIATION 111702
 
9.2%
ROCKET MORTGAGE, LLC 82436
 
6.8%
WELLS FARGO BANK, N.A. 60938
 
5.0%
PENNYMAC CORP. 60710
 
5.0%
UNITED WHOLESALE MORTGAGE, LLC 49831
 
4.1%
NATIONSTAR MORTGAGE LLC DBA MR. COOPER 49592
 
4.1%
MATRIX FINANCIAL SERVICES CORPORATION 38439
 
3.2%
LOANDEPOT.COM, LLC 37050
 
3.0%
NEWREZ LLC 36152
 
3.0%
Other values (15) 333872
27.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
other 358972
 
9.4%
servicers 358972
 
9.4%
llc 333408
 
8.7%
bank 291813
 
7.7%
mortgage 284035
 
7.4%
corporation 145443
 
3.8%
national 127624
 
3.3%
association 127624
 
3.3%
jpmorgan 111702
 
2.9%
chase 111702
 
2.9%
Other values (40) 1561864
41.0%

Most occurring characters

ValueCountFrequency (%)
2593465
 
9.4%
A 2376307
 
8.7%
O 2331642
 
8.5%
N 1833892
 
6.7%
R 1358338
 
4.9%
C 1319891
 
4.8%
E 1314739
 
4.8%
L 1295231
 
4.7%
I 1190861
 
4.3%
T 1161581
 
4.2%
Other values (26) 10694061
38.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 19286083
70.2%
Lowercase Letter 4666636
 
17.0%
Space Separator 2593465
 
9.4%
Other Punctuation 923824
 
3.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2376307
12.3%
O 2331642
12.1%
N 1833892
9.5%
R 1358338
 
7.0%
C 1319891
 
6.8%
E 1314739
 
6.8%
L 1295231
 
6.7%
I 1190861
 
6.2%
T 1161581
 
6.0%
S 832811
 
4.3%
Other values (15) 4270790
22.1%
Lowercase Letter
ValueCountFrequency (%)
r 1076916
23.1%
e 1076916
23.1%
s 717944
15.4%
h 358972
 
7.7%
c 358972
 
7.7%
t 358972
 
7.7%
i 358972
 
7.7%
v 358972
 
7.7%
Other Punctuation
ValueCountFrequency (%)
, 489796
53.0%
. 434028
47.0%
Space Separator
ValueCountFrequency (%)
2593465
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23952719
87.2%
Common 3517289
 
12.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2376307
 
9.9%
O 2331642
 
9.7%
N 1833892
 
7.7%
R 1358338
 
5.7%
C 1319891
 
5.5%
E 1314739
 
5.5%
L 1295231
 
5.4%
I 1190861
 
5.0%
T 1161581
 
4.8%
r 1076916
 
4.5%
Other values (23) 8693321
36.3%
Common
ValueCountFrequency (%)
2593465
73.7%
, 489796
 
13.9%
. 434028
 
12.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27470008
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2593465
 
9.4%
A 2376307
 
8.7%
O 2331642
 
8.5%
N 1833892
 
6.7%
R 1358338
 
4.9%
C 1319891
 
4.8%
E 1314739
 
4.8%
L 1295231
 
4.7%
I 1190861
 
4.3%
T 1161581
 
4.2%
Other values (26) 10694061
38.9%

Estimated_Loan_to_Value_(ELTV)
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 1165178
Missing (%): 95.5%
Memory size: 2.3 MiB
Value  Count  Frequency (%)
True  54516  4.5%
(Missing)  1165178  95.5%

Zero_Balance_Removal_UPB
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1219694
Missing (%): 100.0%
Memory size: 9.3 MiB
Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size67.5 MiB
9
1175053 
H
 
42747
F
 
1894

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1219694
Distinct characters3
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row9
2nd row9
3rd row9
4th row9
5th row9

Common Values

ValueCountFrequency (%)
9 1175053
96.3%
H 42747
 
3.5%
F 1894
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
9 1175053
96.3%
h 42747
 
3.5%
f 1894
 
0.2%

Most occurring characters

ValueCountFrequency (%)
9 1175053
96.3%
H 42747
 
3.5%
F 1894
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1175053
96.3%
Uppercase Letter 44641
 
3.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 42747
95.8%
F 1894
 
4.2%
Decimal Number
ValueCountFrequency (%)
9 1175053
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1175053
96.3%
Latin 44641
 
3.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 42747
95.8%
F 1894
 
4.2%
Common
ValueCountFrequency (%)
9 1175053
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1219694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 1175053
96.3%
H 42747
 
3.5%
F 1894
 
0.2%

Delinquency_Due_to_Disaster
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1219694
Missing (%): 100.0%
Memory size: 9.3 MiB
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size67.5 MiB
2
623360 
1
564488 
3
 
31670
9
 
176

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1219694
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2
2nd row2
3rd row1
4th row2
5th row1

Common Values

ValueCountFrequency (%)
2 623360
51.1%
1 564488
46.3%
3 31670
 
2.6%
9 176
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2 623360
51.1%
1 564488
46.3%
3 31670
 
2.6%
9 176
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
2 623360
51.1%
1 564488
46.3%
3 31670
 
2.6%
9 176
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1219694
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 623360
51.1%
1 564488
46.3%
3 31670
 
2.6%
9 176
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 1219694
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 623360
51.1%
1 564488
46.3%
3 31670
 
2.6%
9 176
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1219694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 623360
51.1%
1 564488
46.3%
3 31670
 
2.6%
9 176
 
< 0.1%
Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 MiB
Value  Count  Frequency (%)
False  1219694  100.0%

Interest_Bearing_UPB
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 67.5 MiB
7
1006523 
N
187742 
Y
 
25429

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1219694
Distinct characters3
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row7
2nd row7
3rd rowY
4th row7
5th row7

Common Values

ValueCountFrequency (%)
7 1006523
82.5%
N 187742
 
15.4%
Y 25429
 
2.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
7 1006523
82.5%
n 187742
 
15.4%
y 25429
 
2.1%

Most occurring characters

ValueCountFrequency (%)
7 1006523
82.5%
N 187742
 
15.4%
Y 25429
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1006523
82.5%
Uppercase Letter 213171
 
17.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 187742
88.1%
Y 25429
 
11.9%
Decimal Number
ValueCountFrequency (%)
7 1006523
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1006523
82.5%
Latin 213171
 
17.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 187742
88.1%
Y 25429
 
11.9%
Common
ValueCountFrequency (%)
7 1006523
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1219694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 1006523
82.5%
N 187742
 
15.4%
Y 25429
 
2.1%

Interactions

[Figure: grid of pairwise interaction plots, 121 panels; images not preserved]

Correlations

Loan_Sequence_NumberMonthly_Reporting_PeriodCurrent_Loan_Delinquency_StatusLoan_AgeRemaining_Months_to_Legal_MaturityZero_Balance_CodeZero_Balance_Effective_DateCurrent_Interest_RateCurrent_Deferred_UPBDue_Date_of_Last_Paid_Installment_(DDLPI)Maintenance_and_Preservation_CostsActual_Loss_CalculationCurrent_Actual_UPBDefect_Settlement_DateModification_FlagMI_RecoveriesExpensesLegal_CostsMiscellaneous_ExpensesModification_CostStep_Modification_FlagDeferred_Payment_PlanDelinquent_Accrued_InterestBorrower_Assistance_Status_CodeInterest_Bearing_UPB
Loan_Sequence_Number1.000-0.018-0.077-0.001-0.099-0.162-0.1830.055-0.160-0.2280.025-0.0810.0070.0000.0020.0050.0040.0020.0100.0020.0070.0090.0000.0050.003
Monthly_Reporting_Period-0.0181.0000.512-0.0100.0020.0010.008-0.0290.0010.106-0.021-0.0370.0050.0000.0040.0080.0160.0050.0080.0060.0120.0130.0020.0100.007
Current_Loan_Delinquency_Status-0.0770.5121.0000.0230.1770.2650.0860.1640.2640.4850.0530.7990.1660.0180.0450.0660.0530.0340.1930.0330.0610.0640.0500.0900.154
Loan_Age-0.001-0.0100.0231.000-0.030-0.0500.0300.084-0.0490.0070.1620.0360.0230.0500.0410.0730.5610.1100.0470.0160.0620.0610.0290.0390.045
Remaining_Months_to_Legal_Maturity -0.099 0.002 0.177 -0.030 1.000 0.659 0.068 0.069 0.661 0.069 -0.059 0.212 0.477 0.029 0.087 0.049 0.048 0.037 0.365 0.032 0.046 0.042 0.392 0.171 0.763
Zero_Balance_Code -0.162 0.001 0.265 -0.050 0.659 1.000 0.099 0.159 0.993 0.208 -0.118 0.320 0.026 0.005 0.003 0.006 0.014 0.002 0.012 0.004 0.019 0.017 0.103 0.006 0.014
Zero_Balance_Effective_Date -0.183 0.008 0.086 0.030 0.068 0.099 1.000 0.062 0.098 0.130 0.045 0.098 0.000 0.006 0.019 0.002 0.000 0.000 0.002 0.004 0.002 0.001 0.000 0.004 0.001
Current_Interest_Rate 0.055 -0.029 0.164 0.084 0.069 0.159 0.062 1.000 0.152 -0.064 0.159 0.218 0.032 0.119 0.074 0.117 0.164 0.074 0.055 0.068 0.059 0.075 0.074 0.072 0.060
Current_Deferred_UPB -0.160 0.001 0.264 -0.049 0.661 0.993 0.098 0.152 1.000 0.208 -0.118 0.319 0.500 0.035 0.091 0.054 0.081 0.042 0.403 0.036 0.050 0.049 0.178 0.202 0.706
Due_Date_of_Last_Paid_Installment_(DDLPI) -0.228 0.106 0.485 0.007 0.069 0.208 0.130 -0.064 0.208 1.000 0.023 0.502 0.067 0.083 0.239 0.062 0.040 0.031 0.152 0.025 0.075 0.082 0.049 0.116 0.088
Maintenance_and_Preservation_Costs 0.025 -0.021 0.053 0.162 -0.059 -0.118 0.045 0.159 -0.118 0.023 1.000 0.079 0.057 0.072 0.077 0.142 0.948 0.174 0.101 0.034 0.101 0.109 0.057 0.075 0.084
Actual_Loss_Calculation -0.081 -0.037 0.799 0.036 0.212 0.320 0.098 0.218 0.319 0.502 0.079 1.000 0.166 0.018 0.045 0.066 0.053 0.034 0.192 0.033 0.061 0.064 0.050 0.090 0.154
Current_Actual_UPB 0.007 0.005 0.166 0.023 0.477 0.026 0.000 0.032 0.500 0.067 0.057 0.166 1.000 0.016 0.098 0.065 0.077 0.092 0.577 0.066 0.114 0.099 0.233 0.249 0.414
Defect_Settlement_Date 0.000 0.000 0.018 0.050 0.029 0.005 0.006 0.119 0.035 0.083 0.072 0.018 0.016 1.000 0.186 0.015 0.092 0.059 0.012 0.021 0.025 0.026 0.007 0.075 0.031
Modification_Flag 0.002 0.004 0.045 0.041 0.087 0.003 0.019 0.074 0.091 0.239 0.077 0.045 0.098 0.186 1.000 0.027 0.105 0.078 0.100 0.038 0.075 0.080 0.043 0.164 0.074
MI_Recoveries 0.005 0.008 0.066 0.073 0.049 0.006 0.002 0.117 0.054 0.062 0.142 0.066 0.065 0.015 0.027 1.000 0.187 0.052 0.067 0.018 0.600 0.581 0.050 0.068 0.040
Expenses 0.004 0.016 0.053 0.561 0.048 0.014 0.000 0.164 0.081 0.040 0.948 0.053 0.077 0.092 0.105 0.187 1.000 0.230 0.125 0.045 0.102 0.096 0.099 0.106 0.093
Legal_Costs 0.002 0.005 0.034 0.110 0.037 0.002 0.000 0.074 0.042 0.031 0.174 0.034 0.092 0.059 0.078 0.052 0.230 1.000 0.106 0.046 0.072 0.069 0.046 0.051 0.046
Miscellaneous_Expenses 0.010 0.008 0.193 0.047 0.365 0.012 0.002 0.055 0.403 0.152 0.101 0.192 0.577 0.012 0.100 0.067 0.125 0.106 1.000 0.040 0.173 0.169 0.111 0.336 0.324
Modification_Cost 0.002 0.006 0.033 0.016 0.032 0.004 0.004 0.068 0.036 0.025 0.034 0.033 0.066 0.021 0.038 0.018 0.045 0.046 0.040 1.000 0.024 0.023 0.071 0.017 0.039
Step_Modification_Flag 0.007 0.012 0.061 0.062 0.046 0.019 0.002 0.059 0.050 0.075 0.101 0.061 0.114 0.025 0.075 0.600 0.102 0.072 0.173 0.024 1.000 0.864 0.149 0.140 0.078
Deferred_Payment_Plan 0.009 0.013 0.064 0.061 0.042 0.017 0.001 0.075 0.049 0.082 0.109 0.064 0.099 0.026 0.080 0.581 0.096 0.069 0.169 0.023 0.864 1.000 0.141 0.133 0.075
Delinquent_Accrued_Interest 0.000 0.002 0.050 0.029 0.392 0.103 0.000 0.074 0.178 0.049 0.057 0.050 0.233 0.007 0.043 0.050 0.099 0.046 0.111 0.071 0.149 0.141 1.000 0.049 0.151
Borrower_Assistance_Status_Code 0.005 0.010 0.090 0.039 0.171 0.006 0.004 0.072 0.202 0.116 0.075 0.090 0.249 0.075 0.164 0.068 0.106 0.051 0.336 0.017 0.140 0.133 0.049 1.000 0.185
Interest_Bearing_UPB 0.003 0.007 0.154 0.045 0.763 0.014 0.001 0.060 0.706 0.088 0.084 0.154 0.414 0.031 0.074 0.040 0.093 0.046 0.324 0.039 0.078 0.075 0.151 0.185 1.000
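
The "High correlation" alerts can be related back to the matrix rows by listing every variable pair whose coefficient exceeds a cutoff. The following is a minimal sketch, not the report generator's own logic: the helper name `high_correlation_pairs` and the cutoff of 0.5 are assumptions made for illustration, and the four-variable excerpt is copied from the MI_Recoveries, Legal_Costs, Step_Modification_Flag and Deferred_Payment_Plan rows above.

```python
import pandas as pd

# Four-variable excerpt copied from the correlation matrix rows above.
names = [
    "MI_Recoveries",
    "Legal_Costs",
    "Step_Modification_Flag",
    "Deferred_Payment_Plan",
]
excerpt = pd.DataFrame(
    [
        [1.000, 0.052, 0.600, 0.581],
        [0.052, 1.000, 0.072, 0.069],
        [0.600, 0.072, 1.000, 0.864],
        [0.581, 0.069, 0.864, 1.000],
    ],
    index=names,
    columns=names,
)


def high_correlation_pairs(corr, threshold=0.5):
    """Return (var_a, var_b, r) for each pair with |r| >= threshold.

    The threshold default is an assumed value for this sketch; the report
    does not state which cutoff triggers its alerts.
    """
    cols = list(corr.columns)
    pairs = []
    # Walk the upper triangle only, so each pair is reported once.
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            r = float(corr.loc[a, b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


pairs = high_correlation_pairs(excerpt)
for a, b, r in pairs:
    print(f"{a} ~ {b}: {r:.3f}")
```

With this cutoff the excerpt yields three pairs, which is consistent with the alert that MI_Recoveries is highly correlated with Step_Modification_Flag and one other field, while Legal_Costs is paired with nothing.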

Missing values

[Figure: SVG plot, generated 2023-10-06 with Matplotlib v3.7.3]
A simple visualization of nullity by column.
[Figure: SVG plot, generated 2023-10-06 with Matplotlib v3.7.3]
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure: SVG plot, generated 2023-10-06 with Matplotlib v3.7.3]
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
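
The quantities behind these three views can be approximated directly in pandas. This is a minimal sketch under stated assumptions: nullity correlation is taken here to be the Pearson correlation between per-column missingness indicators, which is one common definition and may differ from what the report generator computes. The helper names `missing_summary` and `nullity_correlation`, and the toy columns `col_a` to `col_d`, are hypothetical.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Count and percentage of missing cells per column."""
    n_missing = df.isnull().sum()
    return pd.DataFrame(
        {
            "n_missing": n_missing,
            "pct_missing": 100.0 * n_missing / len(df),
        }
    )


def nullity_correlation(df):
    """Pearson correlation between the missingness indicators of columns.

    Columns that are never missing or always missing have a constant
    indicator (zero variance), so they are dropped before correlating.
    """
    is_null = df.isnull()
    varying = is_null.loc[:, is_null.any() & ~is_null.all()]
    return varying.astype(int).corr()


# Hypothetical toy frame: col_a and col_b are missing in the same rows,
# col_c is missing exactly where col_a is present, col_d is complete.
toy = pd.DataFrame(
    {
        "col_a": [1.0, np.nan, 3.0, np.nan, 5.0, 6.0],
        "col_b": [10.0, np.nan, 30.0, np.nan, 50.0, 60.0],
        "col_c": [np.nan, 2.0, np.nan, 4.0, np.nan, np.nan],
        "col_d": [0, 0, 0, 0, 0, 0],
    }
)

summary = missing_summary(toy)
nullity = nullity_correlation(toy)
print(summary)
print(nullity)
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one is present where the other is missing, and complete columns such as `col_d` do not appear in the result at all.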

Sample

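First rows

(index) | Loan_Sequence_Number | Monthly_Reporting_Period | Current_Actual_UPB | Current_Loan_Delinquency_Status | Loan_Age | Remaining_Months_to_Legal_Maturity | Defect_Settlement_Date | Modification_Flag | Zero_Balance_Code | Zero_Balance_Effective_Date | Current_Interest_Rate | Current_Deferred_UPB | Due_Date_of_Last_Paid_Installment_(DDLPI) | MI_Recoveries | Net_Sales_Proceeds | Non_MI_Recoveries | Expenses | Legal_Costs | Maintenance_and_Preservation_Costs | Taxes_and_Insurance | Miscellaneous_Expenses | Actual_Loss_Calculation | Modification_Cost | Step_Modification_Flag | Deferred_Payment_Plan | Estimated_Loan_to_Value_(ELTV) | Zero_Balance_Removal_UPB | Delinquent_Accrued_Interest | Delinquency_Due_to_Disaster | Borrower_Assistance_Status_Code | Current_Month_Modification_Cost | Interest_Bearing_UPB
0 | 725 | 202103 | Y | 205102 | 15804.0 | 0 | 1 | P | 80 | 30 | 298000 | 80 | 2.625 | C | N | FRM | NJ | PU | 8000 | F20Q41275812 | P | 360 | 1 | WELLS FARGO BANK, N.A. | WELLS FARGO BANK, N.A. | NaN | NaN | 9 | NaN | 2 | N | 7
1 | 737 | 202103 | N | 205102 | NaN | 0 | 1 | P | 74 | 27 | 280000 | 74 | 2.500 | R | N | FRM | CO | SF | 81200 | F21Q11275813 | C | 360 | 2 | Other sellers | U.S. BANK N.A. | NaN | NaN | 9 | NaN | 2 | N | 7
2 | 744 | 202105 | N | 204104 | 48864.0 | 6 | 1 | P | 85 | 34 | 285000 | 85 | 2.625 | R | N | FRM | DE | SF | 19800 | F21Q11275814 | N | 240 | 1 | Other sellers | U.S. BANK N.A. | NaN | NaN | 9 | NaN | 1 | N | Y
3 | 676 | 202104 | N | 204103 | NaN | 0 | 1 | P | 79 | 23 | 146000 | 79 | 3.000 | R | N | FRM | KY | SF | 40400 | F21Q11275815 | N | 240 | 2 | Other sellers | U.S. BANK N.A. | NaN | NaN | 9 | NaN | 2 | N | 7
4 | 765 | 202103 | N | 203602 | 12700.0 | 0 | 1 | S | 60 | 43 | 312000 | 60 | 2.125 | R | N | FRM | MA | SF | 2600 | F21Q11275816 | N | 180 | 2 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 1 | N | 7
5 | 686 | 202103 | N | 205102 | 19780.0 | 0 | 1 | P | 80 | 39 | 292000 | 80 | 3.375 | R | N | FRM | IA | SF | 50300 | F21Q11275817 | N | 360 | 2 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 2 | N | 7
6 | 762 | 202103 | N | 205102 | NaN | 0 | 1 | P | 80 | 20 | 352000 | 80 | 2.625 | R | N | FRM | CO | SF | 81200 | F21Q11275818 | C | 360 | 2 | Other sellers | U.S. BANK N.A. | NaN | NaN | 9 | NaN | 2 | N | 7
7 | 647 | 202104 | N | 205103 | NaN | 0 | 1 | P | 80 | 34 | 339000 | 80 | 3.500 | B | N | FRM | KY | SF | 40400 | F21Q11275819 | N | 360 | 2 | Other sellers | U.S. BANK N.A. | NaN | NaN | 9 | NaN | 1 | N | 7
8 | 786 | 202103 | N | 205102 | NaN | 0 | 1 | P | 77 | 16 | 215000 | 77 | 2.750 | R | N | FRM | VT | SF | 5800 | F21Q11275820 | C | 360 | 2 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 2 | N | 7
9 | 749 | 202103 | Y | 205102 | 28740.0 | 30 | 1 | P | 93 | 45 | 149000 | 93 | 3.000 | R | N | FRM | NY | CO | 12500 | F21Q11275821 | P | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 2 | N | N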
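
Last rows

(index) | Loan_Sequence_Number | Monthly_Reporting_Period | Current_Actual_UPB | Current_Loan_Delinquency_Status | Loan_Age | Remaining_Months_to_Legal_Maturity | Defect_Settlement_Date | Modification_Flag | Zero_Balance_Code | Zero_Balance_Effective_Date | Current_Interest_Rate | Current_Deferred_UPB | Due_Date_of_Last_Paid_Installment_(DDLPI) | MI_Recoveries | Net_Sales_Proceeds | Non_MI_Recoveries | Expenses | Legal_Costs | Maintenance_and_Preservation_Costs | Taxes_and_Insurance | Miscellaneous_Expenses | Actual_Loss_Calculation | Modification_Cost | Step_Modification_Flag | Deferred_Payment_Plan | Estimated_Loan_to_Value_(ELTV) | Zero_Balance_Removal_UPB | Delinquent_Accrued_Interest | Delinquency_Due_to_Disaster | Borrower_Assistance_Status_Code | Current_Month_Modification_Cost | Interest_Bearing_UPB
1219684 | 704 | 202210 | N | 205209 | 25060.0 | 0 | 1 | P | 69 | 43 | 276000 | 69 | 4.625 | R | N | FRM | MS | SF | 39400 | F21Q12497610 | N | 360 | 2 | Other sellers | PHH MORTGAGE CORPORATION | NaN | NaN | 9 | NaN | 2 | N | 7
1219685 | 807 | 202208 | N | 205203 | 38060.0 | 0 | 1 | P | 58 | 20 | 426000 | 58 | 5.150 | R | N | FRM | AZ | SF | 85200 | F21Q12497611 | N | 356 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 2 | N | 7
1219686 | 816 | 202207 | N | 205201 | 31540.0 | 0 | 1 | P | 60 | 38 | 447000 | 60 | 4.050 | R | N | FRM | WI | SF | 53500 | F21Q12497612 | N | 355 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 2 | N | 7
1219687 | 778 | 202204 | N | 205203 | 34060.0 | 0 | 1 | P | 39 | 17 | 80000 | 39 | 3.250 | R | N | FRM | WV | SF | 26500 | F21Q12497613 | N | 360 | 2 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 2 | N | 7
1219688 | 666 | 202209 | N | 205208 | 35380.0 | 25 | 1 | P | 90 | 38 | 408000 | 90 | 3.750 | C | N | FRM | LA | PU | 70000 | F21Q12497614 | N | 360 | 2 | JPMORGAN CHASE BANK, NATIONAL ASSOCIATION | JPMORGAN CHASE BANK, NATIONAL ASSOCIATION | NaN | NaN | 9 | NaN | 2 | N | N
1219689 | 732 | 202211 | N | 205210 | 12940.0 | 0 | 1 | P | 79 | 41 | 252000 | 79 | 3.125 | C | N | FRM | LA | SF | 70800 | F21Q12497615 | N | 360 | 1 | JPMORGAN CHASE BANK, NATIONAL ASSOCIATION | JPMORGAN CHASE BANK, NATIONAL ASSOCIATION | NaN | NaN | 9 | NaN | 2 | N | 7
1219690 | 803 | 202203 | N | 205202 | 14540.0 | 30 | 1 | P | 93 | 29 | 204000 | 93 | 3.250 | R | N | FRM | KY | SF | 42100 | F21Q12497616 | N | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 2 | N | N
1219691 | 780 | 202301 | N | 205212 | 27260.0 | 0 | 1 | S | 66 | 25 | 510000 | 66 | 6.500 | R | N | FRM | FL | PU | 32200 | F21Q12497618 | N | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 2 | N | 7
1219692 | 789 | 202210 | N | 205209 | 34940.0 | 0 | 1 | P | 55 | 48 | 189000 | 55 | 3.125 | R | N | FRM | FL | SF | 34100 | F21Q12497619 | P | 360 | 2 | CROSSCOUNTRY MORTGAGE, LLC | CROSSCOUNTRY MORTGAGE, LLC | NaN | NaN | 9 | NaN | 2 | N | 7
1219693 | 681 | 202209 | N | 205208 | 38940.0 | 12 | 1 | P | 85 | 33 | 423000 | 85 | 5.875 | R | N | FRM | FL | PU | 34900 | F21Q12497620 | N | 360 | 2 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 2 | N | N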